
    Extent and distribution of linkage disequilibrium in the Old Order Amish

    Knowledge of the extent and distribution of linkage disequilibrium (LD) is critical to the design and interpretation of gene mapping studies. Because the demographic history of each population varies and is often not accurately known, it is necessary to empirically evaluate LD on a population-specific basis. Here we present the first genome-wide survey of LD in the Old Order Amish (OOA) of Lancaster County, Pennsylvania, a closed population derived from a modest number of founders. Specifically, we present a comparison of LD between OOA individuals and US Utah participants in the International HapMap project (abbreviated CEU) using a high-density single nucleotide polymorphism (SNP) map. Overall, the allele (and haplotype) frequency distributions and LD profiles were remarkably similar between these two populations. For example, the median absolute allele frequency difference for autosomal SNPs was 0.05, with an inter-quartile range of 0.02–0.09, and for autosomal SNPs 10–20 kb apart with common alleles (minor allele frequency ≥ 0.05), the LD measure r² was at least 0.8 for 15% and 14% of SNP pairs in the OOA and CEU, respectively. Moreover, tag SNPs selected from the HapMap CEU sample captured a substantial portion of the common variation in the OOA (∼88%) at r² ≥ 0.8. These results suggest that the OOA and CEU may share similar LD profiles for other common but untyped SNPs. Thus, in the context of the common variant-common disease hypothesis, genetic variants discovered in gene mapping studies in the OOA may generalize to other populations. Genet. Epidemiol. 34: 146–150, 2010. © 2009 Wiley-Liss, Inc. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/64895/1/20444_ftp.pd
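The r² measure quoted in this abstract can be illustrated with a minimal Python sketch. It assumes that phased haplotype frequencies for two biallelic SNPs are already known. The names `ld_r2` and `is_tagged` and their parameters are hypothetical, chosen for illustration; this is not the pipeline used in the study.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and allele B at SNP 2
    p_a:  frequency of allele A at SNP 1
    p_b:  frequency of allele B at SNP 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both SNPs must be polymorphic")
    # D is the deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b
    # r^2 normalizes D^2 by the product of the allele frequency variances.
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


def is_tagged(p_ab: float, p_a: float, p_b: float, threshold: float = 0.8) -> bool:
    """True if one SNP captures the other at the given r^2 threshold (0.8 in the abstract)."""
    return ld_r2(p_ab, p_a, p_b) >= threshold
```

With this definition, two SNPs whose alleles always travel together give r² = 1, and two independent SNPs give r² = 0.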

    KL Estimation of the Power Spectrum Parameters from the Angular Distribution of Galaxies in Early SDSS Data

    We present measurements of parameters of the 3-dimensional power spectrum of galaxy clustering from 222 square degrees of early imaging data in the Sloan Digital Sky Survey. The projected galaxy distribution on the sky is expanded over a set of Karhunen-Loève eigenfunctions, which optimize the signal-to-noise ratio in our analysis. A maximum likelihood analysis is used to estimate parameters that set the shape and amplitude of the 3-dimensional power spectrum. Our best estimates are Γ = 0.188 ± 0.04 and σ_8L = 0.915 ± 0.06 (statistical errors only), for a flat Universe with a cosmological constant. We demonstrate that our measurements contain signal from scales at or beyond the peak of the 3D power spectrum. We discuss how the results scale with systematic uncertainties, like the radial selection function. We find that the central values satisfy the analytically estimated scaling relation. We have also explored the effects of evolutionary corrections, various truncations of the KL basis, seeing, sample size and limiting magnitude. We find that the impact of most of these uncertainties stays within the 2-sigma uncertainties of our fiducial result.

    Deep-coverage whole genome sequences and blood lipids among 16,324 individuals.

    Large-scale deep-coverage whole-genome sequencing (WGS) is now feasible and offers potential advantages for locus discovery. We perform WGS in 16,324 participants from four ancestries at mean depth >29X and analyze genotypes with four quantitative traits: plasma total cholesterol, low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol, and triglycerides. Common variant association yields known loci except for a few variants previously poorly imputed. Rare coding variant association yields known Mendelian dyslipidemia genes, but rare non-coding variant association detects no signals. A high 2M-SNP LDL-C polygenic score (top 5th percentile) confers a similar effect size to a monogenic mutation (~30 mg/dl higher for each); however, among those with severe hypercholesterolemia, 23% have a high polygenic score and only 2% carry a monogenic mutation. At these sample sizes and for these phenotypes, the incremental value of WGS for discovery is limited, but WGS permits simultaneous assessment of monogenic and polygenic models of severe hypercholesterolemia.
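A polygenic score of the kind described in this abstract is, in its simplest form, a weighted sum of effect-allele dosages. The sketch below assumes per-SNP weights are already available and flags individuals in the top 5% of a sample. The function names, the `top_fraction` parameter, and the tie handling are illustrative assumptions, not the method used in the paper.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2 copies of the effect allele per SNP)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def high_score_flags(scores, top_fraction=0.05):
    """Flag individuals whose score falls in the top `top_fraction` of the sample.

    Ties at the cutoff are all flagged, so slightly more than the requested
    fraction can be returned when scores repeat.
    """
    if not scores:
        return []
    n_top = max(1, round(len(scores) * top_fraction))
    # The cutoff is the n_top-th largest score in the sample.
    cutoff = sorted(scores, reverse=True)[n_top - 1]
    return [s >= cutoff for s in scores]
```

In practice the cutoff would usually come from a reference distribution rather than from the sample being scored; the in-sample cutoff here only keeps the example self-contained.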

    Ischemic stroke risk, smoking, and the genetics of inflammation in a biracial population: the stroke prevention in young women study

    Background: Although cigarette smoking is a well-established risk factor for vascular disease, the genetic mechanisms that link cigarette smoking to an increased incidence of stroke are not well understood. Genetic variations within the genes of the inflammatory pathways are thought to partially mediate this risk. Here we evaluate the association of several inflammatory gene single nucleotide polymorphisms (SNPs) with ischemic stroke risk among young women, further stratified by current cigarette smoking status. Methods: A population-based case-control study of stroke among women aged 15–49 identified 224 cases of first ischemic stroke (47.3% African-American) and 211 age-comparable control subjects (43.1% African-American). Several inflammatory candidate gene SNPs chosen through literature review were genotyped in the study population and assessed for association with stroke and interaction with smoking status. Results: Of the 8 SNPs (across 6 genes) analyzed, only IL6 SNP rs2069832 (allele C, African-American frequency = 92%, Caucasian frequency = 55%) was found to be significantly associated with stroke using an additive model, and this was only among African-Americans (age-adjusted: OR = 2.2, 95% CI = 1.0–5.0, p = 0.049; risk factor adjusted: OR = 2.5, 95% CI = 1.0–6.5, p = 0.05). When stratified by smoking status, two SNPs demonstrated statistically significant gene-environment interactions. First, the T allele (frequency = 5%) of IL6 SNP rs2069830 was found to be protective among non-smokers (OR = 0.30, 95% CI = 0.11–0.82, p = 0.02), but not among smokers (OR = 1.63, 95% CI = 0.48–5.58, p = 0.43); genotype by smoking interaction (p = 0.036). Second, the C allele (frequency = 39%) of CD14 SNP rs2569190 was found to increase risk among smokers (OR = 2.05, 95% CI = 1.09–3.86, p = 0.03), but not among non-smokers (OR = 0.93, 95% CI = 0.62–1.39, p = 0.72); genotype by smoking interaction (p = 0.039). Conclusion: This study demonstrates that inflammatory gene SNPs are associated with early-onset ischemic stroke among African-American women (IL6) and that cigarette smoking may modulate stroke risk through a gene-environment interaction (IL6 and CD14). Our finding replicates a prior study showing an interaction between smoking and the C allele of CD14 SNP rs2569190.
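For readers unfamiliar with how an odds ratio (OR) and its 95% confidence interval are obtained, here is a minimal Python sketch for an unadjusted 2×2 table using the Wald interval on the log scale. This is a simplification: the study reports additive-model estimates with age and risk-factor adjustment, which would normally come from logistic regression. The function name `odds_ratio_ci` and the cell labels are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: carriers among cases      b: carriers among controls
    c: non-carriers among cases  d: non-carriers among controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Because the interval is symmetric on the log scale, the point estimate is the geometric mean of the two limits, which is a quick consistency check when reading reported ORs and intervals.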

    Genomic evaluations with many more genotypes

    Background: Genomic evaluations in Holstein dairy cattle have quickly become more reliable over the last two years in many countries as more animals have been genotyped for 50,000 markers. Evaluations can also include animals genotyped with more or fewer markers using new tools such as the 777,000 or 2,900 marker chips recently introduced for cattle. Gains from more markers can be predicted using simulation, whereas strategies to use fewer markers have been compared using subsets of actual genotypes. The overall cost of selection is reduced by genotyping most animals at less than the highest density and imputing their missing genotypes using haplotypes. Algorithms to combine different densities need to be efficient because numbers of genotyped animals and markers may continue to grow quickly. Methods: Genotypes for 500,000 markers were simulated for the 33,414 Holsteins that had 50,000 marker genotypes in the North American database. Another 86,465 non-genotyped ancestors were included in the pedigree file, and linkage disequilibrium was generated directly in the base population. Mixed density datasets were created by keeping 50,000 (every tenth) of the markers for most animals. Missing genotypes were imputed using a combination of population haplotyping and pedigree haplotyping. Reliabilities of genomic evaluations using linear and nonlinear methods were compared. Results: Differing marker sets for a large population were combined with just a few hours of computation. About 95% of paternal alleles were determined correctly, and >95% of missing genotypes were called correctly. Reliability of breeding values was already high (84.4%) with 50,000 simulated markers. The gain in reliability from increasing the number of markers to 500,000 was only 1.6%, but more than half of that gain resulted from genotyping just 1,406 young bulls at higher density. Linear genomic evaluations had reliabilities 1.5% lower than the nonlinear evaluations with 50,000 markers and 1.6% lower with 500,000 markers. Conclusions: Methods to impute genotypes and compute genomic evaluations were affordable with many more markers. Reliabilities for individual animals can be modified to reflect success of imputation. Breeders can improve reliability at lower cost by combining marker densities to increase both the numbers of markers and animals included in genomic evaluation. Larger gains are expected from increasing the number of animals than the number of markers.